ASAFE: ancestry-specific allele frequency estimation
Authors
Abstract
SUMMARY: In a genome-wide association study (GWAS) of an admixed population, such as Hispanic Americans, ancestry-specific allele frequencies can inform the design of a replication GWAS. We derive an EM algorithm that estimates ancestry-specific allele frequencies for a bi-allelic marker, given genotypes and local ancestries in a 3-way admixed population, when the phase of each admixed individual's genotype relative to that individual's pair of local ancestries is unknown. We call our algorithm Ancestry Specific Allele Frequency Estimation (ASAFE). We demonstrate that ASAFE has low error on simulated data.
AVAILABILITY AND IMPLEMENTATION: The R source code for ASAFE is available for download at https://github.com/BiostatQian/ASAFE
CONTACT: [email protected]
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
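The estimation problem described in the abstract can be illustrated with a minimal EM sketch in Python. This is not the ASAFE R implementation and is not taken from the paper. It assumes a single bi-allelic marker, genotypes coded as the count (0, 1 or 2) of one allele, and local ancestries given as an unordered pair of labels in {0, 1, 2} per individual. The function name `em_ancestry_allele_freqs` and all parameter names are hypothetical. The only phase ambiguity in this setup arises for heterozygotes whose two local ancestries differ; the E-step splits the single allele copy between the two ancestries in proportion to the current frequency estimates.

```python
def em_ancestry_allele_freqs(genotypes, ancestries, n_anc=3, n_iter=200, tol=1e-10):
    """Sketch of an EM estimate of P(allele 1 | ancestry k) at one marker.

    genotypes  : list of ints in {0, 1, 2}, copies of allele 1 per individual
    ancestries : list of (a1, a2) pairs, unordered local ancestries in 0..n_anc-1
    (Hypothetical interface; not the ASAFE API.)
    """
    # The number of haplotypes per ancestry is observed, so it is fixed.
    totals = [0.0] * n_anc
    for a1, a2 in ancestries:
        totals[a1] += 1
        totals[a2] += 1

    p = [0.5] * n_anc  # starting values (an arbitrary choice)
    for _ in range(n_iter):
        # E-step: expected copies of allele 1 carried on each ancestry.
        alt = [0.0] * n_anc
        for g, (a1, a2) in zip(genotypes, ancestries):
            if g == 2:
                alt[a1] += 1
                alt[a2] += 1
            elif g == 1:
                if a1 == a2:
                    alt[a1] += 1  # no ambiguity: both haplotypes share an ancestry
                else:
                    # Posterior probability that allele 1 sits on the a1 haplotype.
                    x = p[a1] * (1 - p[a2])
                    y = p[a2] * (1 - p[a1])
                    w = x / (x + y) if x + y > 0 else 0.5
                    alt[a1] += w
                    alt[a2] += 1 - w
        # M-step: expected allele-1 copies divided by haplotypes of that ancestry.
        new_p = [alt[k] / totals[k] if totals[k] > 0 else p[k] for k in range(n_anc)]
        delta = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if delta < tol:
            break
    return p
```

A call such as `em_ancestry_allele_freqs([2, 1, 0, 1], [(0, 1), (0, 2), (1, 2), (1, 1)])` returns one frequency estimate per ancestry. An ancestry that never appears in the sample keeps its starting value, which is a convenience of this sketch rather than a meaningful estimate.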
Similar resources
Local Ancestry Inference in a Large US-Based Hispanic/Latino Study: Hispanic Community Health Study/Study of Latinos (HCHS/SOL)
We estimated local ancestry on the autosomes and X chromosome in a large US-based study of 12,793 Hispanic/Latino individuals using the RFMix method, and we compared different reference panels and approaches to local ancestry estimation on the X chromosome by means of Mendelian inconsistency rates as a proxy for accuracy. We developed a novel and straightforward approach to performing ancestry-...
Estimating relationships between phenotypes and subjects drawn from admixed families
BACKGROUND Estimating relationships among subjects in a sample, within family structures or caused by population substructure, is complicated in admixed populations. Inaccurate allele frequencies can bias both kinship estimates and tests for association between subjects and a phenotype. We analyzed the simulated and real family data from Genetic Analysis Workshop 19, and were aware of the simul...
Examining population stratification via individual ancestry estimates versus self-reported race.
Population stratification has the potential to affect the results of genetic marker studies. Estimating individual ancestry provides a continuous measure to assess population structure in case-control studies of complex disease, instead of using self-reported racial groups. We estimate individual ancestry using the Federal Bureau of Investigation CODIS Core short tandem repeat set of 13 loci us...
TESS3: fast inference of spatial population structure and genome scans for selection.
Geography and landscape are important determinants of genetic variation in natural populations, and several ancestry estimation methods have been proposed to investigate population structure using genetic and geographic data simultaneously. Those approaches are often based on computer-intensive stochastic simulations and do not scale with the dimensions of the data sets generated by high-throug...
Brief communication: Evolution of a specific O allele (O1vG542A) supports unique ancestry of Native Americans.
In this study, we explore the geographic and temporal distribution of a unique variant of the O blood group allele called O1v(G542A) , which has been shown to be shared among Native Americans but is rare in other populations. O1v(G542A) was previously reported in Native American populations in Mesoamerica and South America, and has been proposed as an ancestry informative marker. We investigate...
Journal title:
- Bioinformatics
Volume 32, Issue 14
Pages -
Publication date 2016